{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "1fe6e643-9453-4381-9445-bd471685fb96",
   "metadata": {},
   "source": [
    "## Exploring the Ethos dataset using Autolabel"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "80110a5b-2b3e-45e2-a2da-f6fa00200dff",
   "metadata": {},
   "source": [
    "#### Setup the API Keys for providers that you want to use"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "id": "92993c83-4473-4e05-9510-f543b070c7d0",
   "metadata": {},
   "outputs": [],
   "source": [
    "import os\n",
    "\n",
    "# provide your own OpenAI API key here\n",
    "os.environ[\"OPENAI_API_KEY\"] = \"\""
   ]
  },
  {
   "cell_type": "markdown",
   "id": "9c5b1630",
   "metadata": {},
   "source": [
    "#### Download the dataset"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "id": "9382044e",
   "metadata": {},
   "outputs": [],
   "source": [
    "from autolabel import get_data\n",
    "\n",
    "get_data(\"figure_extraction\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "c72542ae",
   "metadata": {},
   "source": [
    "This downloads two datasets:\n",
    "* `test.csv`: This is the larger dataset we are trying to label using LLMs\n",
    "* `seed.csv`: This is a small dataset where we already have human-provided labels"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "ffb92a79",
   "metadata": {},
   "source": [
    "## Start the labeling process!\n",
    "\n",
    "Labeling with Autolabel is a 3-step process:\n",
    "* First, we specify a labeling configuration (see `config.json` below)\n",
    "* Next, we do a dry-run on our dataset using the LLM specified in `config.json` by running `agent.plan`\n",
    "* Finally, we run the labeling with `agent.run`"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "84b014d1-f45c-4479-9acc-0d20870b1786",
   "metadata": {},
   "source": [
    "### First labeling run"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "id": "c093fe91-3508-4140-8bd6-217034e3cce6",
   "metadata": {},
   "outputs": [],
   "source": [
    "import json\n",
    "\n",
    "from autolabel import LabelingAgent"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "id": "c93fae0b",
   "metadata": {},
   "outputs": [],
   "source": [
    "# load the config\n",
    "with open(\"config_figure_extraction.json\") as f:\n",
    "     config = json.load(f)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "91dfc1f9",
   "metadata": {},
   "source": [
    "Let's review the configuration file below. You'll notice the following useful keys:\n",
    "* `task_type`: `attribute_extraction` (since it's an attribute extraction task)\n",
    "* `model`: `{'provider': 'openai', 'name': 'gpt-3.5-turbo'}` (use a specific OpenAI model)\n",
    "* `prompt.task_guidelines`: `'You are an expert at classifying hate speech and identifying...` (how we describe the task to the LLM)\n",
    "* `prompt.attributes`: `[\n",
    "            {\n",
    "                \"name\": \"violence\",\n",
    "                ...\n",
    "            },\n",
    "            {\n",
    "                \"name\": \"directed_vs_generalized\",\n",
    "                ...\n",
    "            },\n",
    "            {\n",
    "                \"name\": \"gender\",\n",
    "                ...\n",
    "            }\n",
    "        ]` (the full list of labels to choose from)\n",
    "* `prompt.few_shot_num`: 5 (how many labeled examples to provide to the LLM)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "id": "450ad645",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "{'task_name': 'EthosAttributeExtraction',\n",
       " 'task_type': 'attribute_extraction',\n",
       " 'dataset': {'text_column': 'text',\n",
       "  'delimiter': ',',\n",
       "  'image_url_column': 'image_url'},\n",
       " 'model': {'provider': 'openai_vision', 'name': 'gpt-4-vision-preview'},\n",
       " 'prompt': {'task_guidelines': 'You are an expert at understanding plots and charts. Your job is to extract the following attributes from the provided image.',\n",
       "  'attributes': [{'name': 'caption',\n",
       "    'description': 'The caption associated with the provided plot, usually at the bottom of the image.'},\n",
       "   {'name': 'x_axis_name',\n",
       "    'description': 'The label given to the x axis of the plot'},\n",
       "   {'name': 'y_axis_name',\n",
       "    'description': 'The label given to the y axis of the plot'},\n",
       "   {'name': 'title', 'description': 'The title of the provided plot'}],\n",
       "  'example_template': 'Text: {text}\\nOutput: {output_dict}'}}"
      ]
     },
     "execution_count": 5,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "config"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "id": "acb4a3de-fa84-4b94-b17a-7a6fac892a1d",
   "metadata": {},
   "outputs": [],
   "source": [
    "# create an agent for labeling\n",
    "agent = LabelingAgent(config=config)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "id": "92667a39",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "048e697aa6694b6a9adf18fc86da2da2",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "Output()"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\"></pre>\n"
      ],
      "text/plain": []
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
       "</pre>\n"
      ],
      "text/plain": [
       "\n"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">┌──────────────────────────┬─────────┐\n",
       "│<span style=\"color: #800080; text-decoration-color: #800080; font-weight: bold\"> Total Estimated Cost     </span>│<span style=\"color: #008000; text-decoration-color: #008000; font-weight: bold\"> $0.0184 </span>│\n",
       "│<span style=\"color: #800080; text-decoration-color: #800080; font-weight: bold\"> Number of Examples       </span>│<span style=\"color: #008000; text-decoration-color: #008000; font-weight: bold\"> 1       </span>│\n",
       "│<span style=\"color: #800080; text-decoration-color: #800080; font-weight: bold\"> Average cost per example </span>│<span style=\"color: #008000; text-decoration-color: #008000; font-weight: bold\"> $0.0184 </span>│\n",
       "└──────────────────────────┴─────────┘\n",
       "</pre>\n"
      ],
      "text/plain": [
       "┌──────────────────────────┬─────────┐\n",
       "│\u001b[1;35m \u001b[0m\u001b[1;35mTotal Estimated Cost    \u001b[0m\u001b[1;35m \u001b[0m│\u001b[1;32m \u001b[0m\u001b[1;32m$0.0184\u001b[0m\u001b[1;32m \u001b[0m│\n",
       "│\u001b[1;35m \u001b[0m\u001b[1;35mNumber of Examples      \u001b[0m\u001b[1;35m \u001b[0m│\u001b[1;32m \u001b[0m\u001b[1;32m1      \u001b[0m\u001b[1;32m \u001b[0m│\n",
       "│\u001b[1;35m \u001b[0m\u001b[1;35mAverage cost per example\u001b[0m\u001b[1;35m \u001b[0m│\u001b[1;32m \u001b[0m\u001b[1;32m$0.0184\u001b[0m\u001b[1;32m \u001b[0m│\n",
       "└──────────────────────────┴─────────┘\n"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\"><span style=\"color: #00ff00; text-decoration-color: #00ff00\">───────────────────────────────────────────────── </span>Prompt Example<span style=\"color: #00ff00; text-decoration-color: #00ff00\"> ──────────────────────────────────────────────────</span>\n",
       "</pre>\n"
      ],
      "text/plain": [
       "\u001b[92m───────────────────────────────────────────────── \u001b[0mPrompt Example\u001b[92m ──────────────────────────────────────────────────\u001b[0m\n"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\"><span style=\"font-weight: bold\">{</span><span style=\"color: #008000; text-decoration-color: #008000\">\"text\"</span>: <span style=\"color: #008000; text-decoration-color: #008000\">\"You are an expert at understanding plots and charts. Your job is to extract the following attributes from</span>\n",
       "<span style=\"color: #008000; text-decoration-color: #008000\">the provided image.\\n\\nYou will return the extracted attributes as a json with the following keys:\\n{\\n    </span>\n",
       "<span style=\"color: #008000; text-decoration-color: #008000\">\\\"caption\\\": \\\"The caption associated with the provided plot, usually at the bottom of the image.\\\",\\n    </span>\n",
       "<span style=\"color: #008000; text-decoration-color: #008000\">\\\"x_axis_name\\\": \\\"The label given to the x axis of the plot\\\",\\n    \\\"y_axis_name\\\": \\\"The label given to the y </span>\n",
       "<span style=\"color: #008000; text-decoration-color: #008000\">axis of the plot\\\",\\n    \\\"title\\\": \\\"The title of the provided plot\\\"\\n}\\n\\nNow I want you to label the following </span>\n",
       "<span style=\"color: #008000; text-decoration-color: #008000\">example:\\nText:  and title from the plot\\nOutput: \"</span>, <span style=\"color: #008000; text-decoration-color: #008000\">\"image_url\"</span>: \n",
       "<span style=\"color: #008000; text-decoration-color: #008000\">\"https://autolabel-benchmarking.s3.us-west-2.amazonaws.com/figure_extraction/image0.png\"</span><span style=\"font-weight: bold\">}</span>\n",
       "</pre>\n"
      ],
      "text/plain": [
       "\u001b[1m{\u001b[0m\u001b[32m\"text\"\u001b[0m: \u001b[32m\"You are an expert at understanding plots and charts. Your job is to extract the following attributes from\u001b[0m\n",
       "\u001b[32mthe provided image.\\n\\nYou will return the extracted attributes as a json with the following keys:\\n\u001b[0m\u001b[32m{\u001b[0m\u001b[32m\\n    \u001b[0m\n",
       "\u001b[32m\\\"caption\\\": \\\"The caption associated with the provided plot, usually at the bottom of the image.\\\",\\n    \u001b[0m\n",
       "\u001b[32m\\\"x_axis_name\\\": \\\"The label given to the x axis of the plot\\\",\\n    \\\"y_axis_name\\\": \\\"The label given to the y \u001b[0m\n",
       "\u001b[32maxis of the plot\\\",\\n    \\\"title\\\": \\\"The title of the provided plot\\\"\\n\u001b[0m\u001b[32m}\u001b[0m\u001b[32m\\n\\nNow I want you to label the following \u001b[0m\n",
       "\u001b[32mexample:\\nText:  and title from the plot\\nOutput: \"\u001b[0m, \u001b[32m\"image_url\"\u001b[0m: \n",
       "\u001b[32m\"https://autolabel-benchmarking.s3.us-west-2.amazonaws.com/figure_extraction/image0.png\"\u001b[0m\u001b[1m}\u001b[0m\n"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\"><span style=\"color: #00ff00; text-decoration-color: #00ff00\">───────────────────────────────────────────────────────────────────────────────────────────────────────────────────</span>\n",
       "</pre>\n"
      ],
      "text/plain": [
       "\u001b[92m───────────────────────────────────────────────────────────────────────────────────────────────────────────────────\u001b[0m\n"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "# dry-run -- this tells us how much this will cost and shows an example prompt\n",
    "from autolabel import AutolabelDataset\n",
    "\n",
    "ds = AutolabelDataset(\"data/figure_extraction/test.csv\", config=config)\n",
    "agent.plan(ds)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "id": "dd703025-54d8-4349-b0d6-736d2380e966",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "b9fab58322714f4ea6878044e4550612",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "Output()"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\"></pre>\n"
      ],
      "text/plain": []
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">Actual Cost: <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">0.0</span>\n",
       "</pre>\n"
      ],
      "text/plain": [
       "Actual Cost: \u001b[1;36m0.0\u001b[0m\n"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">┏━━━━━━━━━┳━━━━━━━━┳━━━━━━━━━┳━━━━━━━━┳━━━━━━━━━┳━━━━━━━━┳━━━━━━━━━┳━━━━━━━━┳━━━━━━━━━┳━━━━━━━━┳━━━━━━━━━┳━━━━━━━━┓\n",
       "┃<span style=\"font-weight: bold\"> captio… </span>┃<span style=\"font-weight: bold\"> capti… </span>┃<span style=\"font-weight: bold\"> captio… </span>┃<span style=\"font-weight: bold\"> x_axi… </span>┃<span style=\"font-weight: bold\"> x_axis… </span>┃<span style=\"font-weight: bold\"> x_axi… </span>┃<span style=\"font-weight: bold\"> y_axis… </span>┃<span style=\"font-weight: bold\"> y_axi… </span>┃<span style=\"font-weight: bold\"> y_axis… </span>┃<span style=\"font-weight: bold\"> title… </span>┃<span style=\"font-weight: bold\"> title:… </span>┃<span style=\"font-weight: bold\"> title… </span>┃\n",
       "┡━━━━━━━━━╇━━━━━━━━╇━━━━━━━━━╇━━━━━━━━╇━━━━━━━━━╇━━━━━━━━╇━━━━━━━━━╇━━━━━━━━╇━━━━━━━━━╇━━━━━━━━╇━━━━━━━━━╇━━━━━━━━┩\n",
       "│<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\"> 1       </span>│<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\"> 1.0    </span>│<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\"> 1.0     </span>│<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\"> 1      </span>│<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\"> 1.0     </span>│<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\"> 1.0    </span>│<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\"> 1       </span>│<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\"> 1.0    </span>│<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\"> 1.0     </span>│<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\"> 1      </span>│<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\"> 1.0     </span>│<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\"> 1.0    </span>│\n",
       "└─────────┴────────┴─────────┴────────┴─────────┴────────┴─────────┴────────┴─────────┴────────┴─────────┴────────┘\n",
       "</pre>\n"
      ],
      "text/plain": [
       "┏━━━━━━━━━┳━━━━━━━━┳━━━━━━━━━┳━━━━━━━━┳━━━━━━━━━┳━━━━━━━━┳━━━━━━━━━┳━━━━━━━━┳━━━━━━━━━┳━━━━━━━━┳━━━━━━━━━┳━━━━━━━━┓\n",
       "┃\u001b[1m \u001b[0m\u001b[1mcaptio…\u001b[0m\u001b[1m \u001b[0m┃\u001b[1m \u001b[0m\u001b[1mcapti…\u001b[0m\u001b[1m \u001b[0m┃\u001b[1m \u001b[0m\u001b[1mcaptio…\u001b[0m\u001b[1m \u001b[0m┃\u001b[1m \u001b[0m\u001b[1mx_axi…\u001b[0m\u001b[1m \u001b[0m┃\u001b[1m \u001b[0m\u001b[1mx_axis…\u001b[0m\u001b[1m \u001b[0m┃\u001b[1m \u001b[0m\u001b[1mx_axi…\u001b[0m\u001b[1m \u001b[0m┃\u001b[1m \u001b[0m\u001b[1my_axis…\u001b[0m\u001b[1m \u001b[0m┃\u001b[1m \u001b[0m\u001b[1my_axi…\u001b[0m\u001b[1m \u001b[0m┃\u001b[1m \u001b[0m\u001b[1my_axis…\u001b[0m\u001b[1m \u001b[0m┃\u001b[1m \u001b[0m\u001b[1mtitle…\u001b[0m\u001b[1m \u001b[0m┃\u001b[1m \u001b[0m\u001b[1mtitle:…\u001b[0m\u001b[1m \u001b[0m┃\u001b[1m \u001b[0m\u001b[1mtitle…\u001b[0m\u001b[1m \u001b[0m┃\n",
       "┡━━━━━━━━━╇━━━━━━━━╇━━━━━━━━━╇━━━━━━━━╇━━━━━━━━━╇━━━━━━━━╇━━━━━━━━━╇━━━━━━━━╇━━━━━━━━━╇━━━━━━━━╇━━━━━━━━━╇━━━━━━━━┩\n",
       "│\u001b[1;36m \u001b[0m\u001b[1;36m1      \u001b[0m\u001b[1;36m \u001b[0m│\u001b[1;36m \u001b[0m\u001b[1;36m1.0   \u001b[0m\u001b[1;36m \u001b[0m│\u001b[1;36m \u001b[0m\u001b[1;36m1.0    \u001b[0m\u001b[1;36m \u001b[0m│\u001b[1;36m \u001b[0m\u001b[1;36m1     \u001b[0m\u001b[1;36m \u001b[0m│\u001b[1;36m \u001b[0m\u001b[1;36m1.0    \u001b[0m\u001b[1;36m \u001b[0m│\u001b[1;36m \u001b[0m\u001b[1;36m1.0   \u001b[0m\u001b[1;36m \u001b[0m│\u001b[1;36m \u001b[0m\u001b[1;36m1      \u001b[0m\u001b[1;36m \u001b[0m│\u001b[1;36m \u001b[0m\u001b[1;36m1.0   \u001b[0m\u001b[1;36m \u001b[0m│\u001b[1;36m \u001b[0m\u001b[1;36m1.0    \u001b[0m\u001b[1;36m \u001b[0m│\u001b[1;36m \u001b[0m\u001b[1;36m1     \u001b[0m\u001b[1;36m \u001b[0m│\u001b[1;36m \u001b[0m\u001b[1;36m1.0    \u001b[0m\u001b[1;36m \u001b[0m│\u001b[1;36m \u001b[0m\u001b[1;36m1.0   \u001b[0m\u001b[1;36m \u001b[0m│\n",
       "└─────────┴────────┴─────────┴────────┴─────────┴────────┴─────────┴────────┴─────────┴────────┴─────────┴────────┘\n"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "# now, do the actual labeling\n",
    "ds = agent.run(ds)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "id": "74b2ac4b",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "text  and title from the plot\n",
      "caption Figure 5: Comparisons between our multifidelity learning paradigm and single low-fidelity (all GPT-3.5) annotation on four domain-specific tasks given the same total 1000 annotation budget. Note that the samples for all GPT-3.5 are drawn based on the uncertainty score.\n",
      "x_axis_name Annotation size\n",
      "y_axis_name F1-score\n",
      "title FPB (Financial)\n",
      "image_url https://autolabel-benchmarking.s3.us-west-2.amazonaws.com/figure_extraction/image0.png\n",
      "EthosAttributeExtraction_task_label {'caption': 'Figure 5: Comparisons between our multifidelity learning paradigm and single low-fidelity (all GPT-3.5) annotation on four domain-specific tasks given the same total 1000 annotation budget. Note that the samples for all GPT-3.5 are drawn based on the uncertainty score.', 'x_axis_name': 'Annotation size', 'y_axis_name': 'F1-score', 'title': 'FPB (Financial)'}\n",
      "caption_label Figure 5: Comparisons between our multifidelity learning paradigm and single low-fidelity (all GPT-3.5) annotation on four domain-specific tasks given the same total 1000 annotation budget. Note that the samples for all GPT-3.5 are drawn based on the uncertainty score.\n",
      "x_axis_name_label Annotation size\n",
      "y_axis_name_label F1-score\n",
      "title_label FPB (Financial)\n",
      "EthosAttributeExtraction_task_error None\n",
      "EthosAttributeExtraction_task_prompt {\"text\": \"You are an expert at understanding plots and charts. Your job is to extract the following attributes from the provided image.\\n\\nYou will return the extracted attributes as a json with the following keys:\\n{\\n    \\\"caption\\\": \\\"The caption associated with the provided plot, usually at the bottom of the image.\\\",\\n    \\\"x_axis_name\\\": \\\"The label given to the x axis of the plot\\\",\\n    \\\"y_axis_name\\\": \\\"The label given to the y axis of the plot\\\",\\n    \\\"title\\\": \\\"The title of the provided plot\\\"\\n}\\n\\nNow I want you to label the following example:\\nText:  and title from the plot\\nOutput: \", \"image_url\": \"https://autolabel-benchmarking.s3.us-west-2.amazonaws.com/figure_extraction/image0.png\"}\n",
      "EthosAttributeExtraction_task_successfully_labeled True\n",
      "EthosAttributeExtraction_task_annotation successfully_labeled=True label={'caption': 'Figure 5: Comparisons between our multifidelity learning paradigm and single low-fidelity (all GPT-3.5) annotation on four domain-specific tasks given the same total 1000 annotation budget. Note that the samples for all GPT-3.5 are drawn based on the uncertainty score.', 'x_axis_name': 'Annotation size', 'y_axis_name': 'F1-score', 'title': 'FPB (Financial)'} curr_sample=b'\\x80\\x04\\x95\\r\\x02\\x00\\x00\\x00\\x00\\x00\\x00}\\x94(\\x8c\\x04text\\x94\\x8c\\x18 and title from the plot\\x94\\x8c\\x07caption\\x94X\\r\\x01\\x00\\x00Figure 5: Comparisons between our multifidelity learning paradigm and single low-fidelity (all GPT-3.5) annotation on four domain-specific tasks given the same total 1000 annotation budget. Note that the samples for all GPT-3.5 are drawn based on the uncertainty score.\\x94\\x8c\\x0bx_axis_name\\x94\\x8c\\x0fAnnotation size\\x94\\x8c\\x0by_axis_name\\x94\\x8c\\x08F1-score\\x94\\x8c\\x05title\\x94\\x8c\\x0fFPB (Financial)\\x94\\x8c\\timage_url\\x94\\x8cVhttps://autolabel-benchmarking.s3.us-west-2.amazonaws.com/figure_extraction/image0.png\\x94\\x8c\\x0boutput_dict\\x94\\x8c\\x00\\x94u.' confidence_score=None generation_info=None raw_response='{\"caption\": \"Figure 5: Comparisons between our multifidelity learning paradigm and single low-fidelity (all GPT-3.5) annotation on four domain-specific tasks given the same total 1000 annotation budget. Note that the samples for all GPT-3.5 are drawn based on the uncertainty score.\", \"x_axis_name\": \"Annotation size\", \"y_axis_name\": \"F1-score\", \"title\": \"FPB (Financial)\"}' explanation='' prompt='{\"text\": \"You are an expert at understanding plots and charts. Your job is to extract the following attributes from the provided image.\\\\n\\\\nYou will return the extracted attributes as a json with the following keys:\\\\n{\\\\n    \\\\\"caption\\\\\": \\\\\"The caption associated with the provided plot, usually at the bottom of the image.\\\\\",\\\\n    \\\\\"x_axis_name\\\\\": \\\\\"The label given to the x axis of the plot\\\\\",\\\\n    \\\\\"y_axis_name\\\\\": \\\\\"The label given to the y axis of the plot\\\\\",\\\\n    \\\\\"title\\\\\": \\\\\"The title of the provided plot\\\\\"\\\\n}\\\\n\\\\nNow I want you to label the following example:\\\\nText:  and title from the plot\\\\nOutput: \", \"image_url\": \"https://autolabel-benchmarking.s3.us-west-2.amazonaws.com/figure_extraction/image0.png\"}' confidence_prompt='{\"text\": \"You are an expert at understanding plots and charts. Your job is to extract the following attributes from the provided image.\\\\n\\\\nYou will return the extracted attributes as a json with the following keys:\\\\n{\\\\n    \\\\\"caption\\\\\": \\\\\"The caption associated with the provided plot, usually at the bottom of the image.\\\\\",\\\\n    \\\\\"x_axis_name\\\\\": \\\\\"The label given to the x axis of the plot\\\\\",\\\\n    \\\\\"y_axis_name\\\\\": \\\\\"The label given to the y axis of the plot\\\\\",\\\\n    \\\\\"title\\\\\": \\\\\"The title of the provided plot\\\\\"\\\\n}\\\\n\\\\nNow I want you to label the following example:\\\\nText:  and title from the plot\\\\nOutput: \", \"image_url\": \"https://autolabel-benchmarking.s3.us-west-2.amazonaws.com/figure_extraction/image0.png\"}' input_tokens=175 output_tokens=95 cost=0 latency=0.0 error=None\n"
     ]
    }
   ],
   "source": [
    "for col in ds.df.columns:\n",
    "    print(col, ds.df.iloc[0][col])"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3.10.8 64-bit",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.9.19"
  },
  "vscode": {
   "interpreter": {
    "hash": "b0fa6594d8f4cbf19f97940f81e996739fb7646882a419484c72d19e05852a7e"
   }
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}
